enhancing python SDK test coverage#47360
Conversation
…aning up comments
|
@sdkReviewAgent-2 |
|
/azp run python - cosmos - tests |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
This PR substantially expands azure-cosmos SDK coverage across sync + async surfaces, focusing on regressions and behavioral contracts around request headers (User-Agent, SessionToken, Content-Length, response headers), query/feed-range behavior, and routing-map provider resiliency. It also includes some docstring/comment cleanups.
Changes:
- Adds new regression/integration tests for
user_agent_overwrite,read_timeoutpropagation, session-token scoping for feed-range/partition-key queries, and UTF-8 decode fallback behavior (including paged iteration). - Extends routing-map provider unit + integration coverage (including 503 + sub-status assertions and concurrency scenarios).
- Adds additional query/feed-range and encoding (emoji) round-trip tests, plus async parity tests for aggregate utilities and response-header contracts.
Reviewed changes
Copilot reviewed 37 out of 37 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| sdk/cosmos/azure-cosmos/tests/test_user_agent_overwrite_regression.py | New sync regression tests for user_agent_overwrite combinations and wire User-Agent prefix capture. |
| sdk/cosmos/azure-cosmos/tests/test_user_agent_overwrite_regression_async.py | Async counterpart of the user_agent_overwrite regression suite. |
| sdk/cosmos/azure-cosmos/tests/test_session.py | Adds stronger feed-range session-token scoping tests (value/pagination/partition_key-only). |
| sdk/cosmos/azure-cosmos/tests/test_session_async.py | Async versions of the stronger feed-range session-token scoping tests. |
| sdk/cosmos/azure-cosmos/tests/test_session_token_unit.py | Adds direct unit coverage for set_session_token_header sync/async behavior for single-partition selection. |
| sdk/cosmos/azure-cosmos/tests/test_routing_map_provider_unit.py | Enhances unit coverage and asserts 503 sub-status for routing-map snapshot inconsistency; adds SmartRoutingMapProvider e2e tests. |
| sdk/cosmos/azure-cosmos/tests/test_routing_map_provider_unit_async.py | Async equivalent of routing-map provider unit enhancements and e2e tests. |
| sdk/cosmos/azure-cosmos/tests/routing/test_routing_map_provider.py | Adds SmartRoutingMapProvider end-to-end 503 conversion coverage in the routing integration suite. |
| sdk/cosmos/azure-cosmos/tests/routing/test_routing_map_provider_async.py | Refactors imports and adds SmartRoutingMapProvider async end-to-end tests; includes text updates in docstrings/comments. |
| sdk/cosmos/azure-cosmos/tests/test_response_decoding.py | Tightens/extends tests around env-var parsing and decode behavior; documentation updates. |
| sdk/cosmos/azure-cosmos/tests/test_response_decoding_paged.py | New tests for strict vs REPLACE/IGNORE decoding behavior across multiple sequential pages (sync + async request paths). |
| sdk/cosmos/azure-cosmos/tests/test_request_response_decoding.py | Simplifies docs and keeps wiring/behavior tests ensuring _Request uses shared decode helper and surfaces typed exceptions. |
| sdk/cosmos/azure-cosmos/tests/test_read_timeout_propagation.py | New sync tests to verify read_timeout propagation to transport kwargs across query entry points and precedence rules. |
| sdk/cosmos/azure-cosmos/tests/test_read_timeout_propagation_async.py | Async parity for read-timeout propagation assertions. |
| sdk/cosmos/azure-cosmos/tests/test_query_response_headers_async.py | Adds additional async response-header contract tests and a long-pagination bounded-memory test. |
| sdk/cosmos/azure-cosmos/tests/test_query_feed_range.py | Adds multi-partition SELECT VALUE shape/merge assertions for non-aggregate vs aggregate projections. |
| sdk/cosmos/azure-cosmos/tests/test_query_feed_range_async.py | Async parity for the multi-partition SELECT VALUE shape/merge assertions. |
| sdk/cosmos/azure-cosmos/tests/test_query_feed_range_multipartition_async.py | Adjusts async client creation to exercise AAD lane; adds pagination resume tests and safety-guard tests for stalled/overlap cases. |
| sdk/cosmos/azure-cosmos/tests/test_query_aggregate_utils_unit_async.py | New async-side unit tests ensuring shared aggregate classifier/merge behavior parity. |
| sdk/cosmos/azure-cosmos/tests/test_partition_split_retry_unit.py | Expands/cleans up unit tests around structured continuation tokens and removes noisy debug printing. |
| sdk/cosmos/azure-cosmos/tests/test_partition_split_retry_unit_async.py | Async parity for the continuation-token explode/resume and routing-lookup failure checkpoint behavior. |
| sdk/cosmos/azure-cosmos/tests/test_feed_range_continuation_token.py | Extends aggregate detection/merge tests (MIN/MAX case insensitivity, mixed numeric types, boolean classification guards). |
| sdk/cosmos/azure-cosmos/tests/test_encoding.py | Adds sync emoji/multibyte round-trip tests (point + query). |
| sdk/cosmos/azure-cosmos/tests/test_encoding_async.py | New async emoji/multibyte round-trip tests (point + query). |
| sdk/cosmos/azure-cosmos/tests/test_cosmos_paged_unit.py | New unit tests for sync paged wrappers (CosmosItemPaged, CosmosDict, CosmosList) response-header API semantics. |
| sdk/cosmos/azure-cosmos/tests/test_cosmos_paged_unit_async.py | New unit tests for async paged wrapper (CosmosAsyncItemPaged) and response-header API guards. |
| sdk/cosmos/azure-cosmos/tests/test_content_length_encoding.py | Simplifies docs and adds bytes/bytearray body behavior pinning in Content-Length tests (sync + async). |
| sdk/cosmos/azure-cosmos/tests/test_availability_strategy.py | Adds test ensuring per-request availability_strategy=None falls back to client-level hedging strategy (sync). |
| sdk/cosmos/azure-cosmos/tests/test_availability_strategy_async.py | Async parity for availability_strategy=None fallback behavior. |
| sdk/cosmos/azure-cosmos/tests/test_aio_extras_packaging_unit.py | New packaging metadata test ensuring aio extras are declared and async module imports. |
| sdk/cosmos/azure-cosmos/azure/cosmos/_location_cache.py | Docstring/comment tightening for location list cleaning and region-name normalization logic. |
|
✅ Review complete (56:08) Posted 6 inline comment(s). Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage |
|
@sdkReviewAgent-2 |
|
✅ Review complete (07:24) Posted 7 inline comment(s). Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage |
|
@sdkReviewAgent-2 |
|
✅ Review complete (07:57) Posted 1 inline comment(s). Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage |
|
/azp run python - cosmos - tests |
|
Azure Pipelines successfully started running 1 pipeline(s). |
| The getter just reads a dict, so the tests do not need an event loop. | ||
| """ | ||
|
|
||
| def test_get_response_headers_is_empty_before_any_page_fetch(self): |
There was a problem hiding this comment.
None of the below tests are async, we should mark them with async and have proper end to end async validation.
There was a problem hiding this comment.
The method under test, get_response_headers(), is identical on both the sync and async pagers - both are a plain def returning self._response_headers.copy(). There is nothing to await inside it. So rewriting these unit tests as async def would add an event loop and ceremony that will still do the same in m emory copy.
this test pins one contract: if a client ask for headers before any page has been fetched, they get back an empty case-insensitive dict — not None, not an exception.
we already have end to end validation tests
test_query_response_headers_async.py uses IsolatedAsyncioTestCase, creates a real container, runs query_items, does real async for item in query_iterable (and by_page() iteration), and only then asserts on get_response_headers() - including multi-page, empty-result, concurrent, and per-page-overwrite cases.
test_cosmos_responses_async.py does the same shape for async point operations - await upsert_item, create_item, read_item, batch — then checks the returned headers
Please let me know if I am missed your point.
| @@ -215,19 +207,11 @@ async def run_test(): | |||
| asyncio.run(run_test()) | |||
|
|
|||
| def test_sync_inference_2xx_with_invalid_utf8_raises_decode_error(self): | |||
There was a problem hiding this comment.
🔴 Correctness — Strict-mode decode tests are environment-dependent
The test_sync_inference_2xx_with_invalid_utf8_raises_decode_error and its async counterpart expect DecodeError on malformed UTF-8, but they don't isolate AZURE_COSMOS_CHARSET_DECODER_ERROR_ACTION_ON_MALFORMED_INPUT. The setUp/tearDown only save/restore the inference endpoint env var, not the charset decoder env var.
If that env var is set to REPLACE or IGNORE in the process (e.g., by CI configuration, another test, or a developer's shell), decode_response_body_for_status() at _response_decoding.py:49-52 will decode permissively instead of raising, and assertRaises(DecodeError) will fail with DecodeError not raised — or worse, silently stop testing the strict-mode contract.
The REPLACE/IGNORE tests in the same file correctly save/restore the env var manually — but these strict-mode tests assume it's absent. The same issue exists in test_request_response_decoding.py:163.
Suggestion: Add env-var isolation to setUp/tearDown:
def setUp(self):
self._saved_endpoint = os.environ.get(_INFERENCE_ENDPOINT_ENV_VAR)
os.environ[_INFERENCE_ENDPOINT_ENV_VAR] = "https://example.com"
self._saved_malformed = os.environ.get(_MALFORMED_INPUT_ENV_VAR)
os.environ.pop(_MALFORMED_INPUT_ENV_VAR, None) # ensure strict mode
def tearDown(self):
# ... existing endpoint restore ...
if self._saved_malformed is not None:
os.environ[_MALFORMED_INPUT_ENV_VAR] = self._saved_malformed
else:
os.environ.pop(_MALFORMED_INPUT_ENV_VAR, None)| should be raised as DecodeError, keeping the real wire status and | ||
| the original cause attached.""" | ||
|
|
||
| def test_sync_2xx_with_invalid_utf8_raises_decode_error(self): |
There was a problem hiding this comment.
🔴 Correctness — Same env-var dependency as test_semantic_reranker_unit.py
This test expects DecodeError on a 200 response with invalid UTF-8 in default strict mode, but it doesn't clear AZURE_COSMOS_CHARSET_DECODER_ERROR_ACTION_ON_MALFORMED_INPUT. If that env var is set to REPLACE or IGNORE in the process, decode_response_body_for_status() decodes permissively and the assertRaises(DecodeError) assertion fails — or stops testing the strict-mode contract silently if the test runner swallows the failure.
The test_response_decoding.py file already solved this by adding an env-isolating base class. This file should do the same, or at minimum os.environ.pop("AZURE_COSMOS_CHARSET_DECODER_ERROR_ACTION_ON_MALFORMED_INPUT", None) in a setup method.
| # some flows; tolerate a ``None`` there but require | ||
| # at least one /docs/ call to carry the value, and | ||
| # any non-None /docs/ call must match it exactly. | ||
| non_none = [rt for rt in doc_timeouts if rt is not None] |
There was a problem hiding this comment.
🟡 Recommendation — Aggregate timeout assertion allows silent regression on page fetches 2+
The None-filtering here weakens the assertion compared to the non-aggregate sibling tests. Consider this scenario:
- A regression causes only the first
/docscall (the query-plan POST) to carryread_timeout. doc_timeouts = [22, None, None, None]— subsequent page fetches lost the timeout.non_none = [22]→len(non_none) > 0✅ →all(rt == 22 for rt in non_none)✅ — both assertions pass despite the bug.
The non-aggregate paging tests (test_read_timeout_propagates_through_by_page_paging) assert all(rt == timeout for rt in doc_timeouts) with no None filtering, which would catch this same regression.
The comment says the relaxation is for "the query-plan POST against /docs/" — but the query-plan POST and the page fetches can be distinguished by HTTP method (POST vs GET) or by a tighter URL filter.
Suggestion: Assert that at most one /docs call (the query plan) has None, and all the rest carry the expected value:
none_count = sum(1 for rt in doc_timeouts if rt is None)
self.assertLessEqual(none_count, 1, "More than one /docs call dropped read_timeout")
non_none = [rt for rt in doc_timeouts if rt is not None]
self.assertTrue(all(rt == client_timeout for rt in non_none))| # supported and the gateway rejects those queries. ``c.amount`` is | ||
| # used instead of ``c.value`` because ``VALUE`` is a SQL keyword. | ||
|
|
||
| async def test_read_timeout_propagates_through_by_page_value_count_aggregate_async(self): |
There was a problem hiding this comment.
🟡 Recommendation — Sync/async parity gap: enable_cross_partition_query=True omitted in all async aggregate tests
Every sync aggregate test in test_query.py explicitly passes enable_cross_partition_query=True, but the async counterparts in this file omit it entirely (zero occurrences across all new async tests). The same issue was raised and fixed for test_read_timeout_propagation_async.py — this file needs the same treatment.
While the async container auto-enables cross-partition queries when no partition_key is given, the explicit flag exercises a different code path (feed_options["enableCrossPartitionQuery"] = True set by the caller vs. auto-detected by the SDK). If a regression only affects the explicit-flag path in the async client, these tests would not catch it.
Suggestion: Add enable_cross_partition_query=True to all async cross-partition aggregate test queries, matching the sync counterparts.
| # The async client should call the same merge and counting helpers | ||
| # as the sync client, so identical behavior is guaranteed. | ||
| async def test_async_module_reuses_shared_classifier_and_merge_async(self): | ||
| assert _async_conn.base._merge_query_results is _base._merge_query_results |
There was a problem hiding this comment.
🟡 Recommendation — Identity check pins a private import alias, not behavior
assert _async_conn.base._merge_query_results is _base._merge_query_resultsThe attribute base only exists because _cosmos_client_connection_async.py:50 uses from .. import _base as base. If someone refactors this to from .. import _base (dropping the as base alias), this test raises AttributeError: module has no attribute 'base' — not because a behavior changed, but because an internal symbol name changed.
This is a false-positive risk: the test fails when the code is correct, and the error message won't point at a real bug.
Suggestion: Test the behavioral contract instead — e.g., verify that calling _merge_query_results from the async code path produces the same output as from the sync path. Or at minimum, access the module's _base attribute directly (_async_conn._base._merge_query_results) which matches the actual import name and is less fragile.
| finally: | ||
| self._delete_container_for_test(container.id) | ||
|
|
||
| def test_client_level_short_read_timeout_fails_on_by_page(self): |
There was a problem hiding this comment.
🟡 Recommendation — Sub-picosecond timeout is unreliable across environments
read_timeout=0.000000000001 is 1 picosecond. OS timer resolution on Linux is typically 1 µs–1 ms; Windows is ~15.6 ms. The kernel will round this up to its minimum granularity. On fast local networks (e.g., emulator on localhost), the HTTP round-trip can complete before the timer resolution kicks in, meaning no timeout fires and assertRaises raises AssertionError: CosmosClientTimeoutError not raised.
This pattern is inherently flaky — it depends on the request being slower than the OS timer minimum, which varies by machine and load.
Suggestion: Use a mock transport that injects a configurable delay (like the _CaptureTransport pattern elsewhere), or set read_timeout to something slightly larger but still practically guaranteed to expire (e.g., 0.0001 = 100 µs against a remote endpoint). Alternatively, if the intent is just to prove the kwarg reaches the wire, the _capture_pipeline_read_timeouts pattern already used above is more deterministic.
| for i in range(3): | ||
| await container.create_item({"id": f"item_{i}_{uuid.uuid4()}", "pk": "p", "data": i}) | ||
|
|
||
| async with CosmosClient(self.host, self.masterKey, read_timeout=0.000000000001) as short_client: |
There was a problem hiding this comment.
🟡 Recommendation — Async test hardcodes self.masterKey while sync uses self.credential
All async client-level timeout tests create the client with self.masterKey:
async with CosmosClient(self.host, self.masterKey, read_timeout=...) as ct_client:But the sync counterparts use self.credential:
with cosmos_client.CosmosClient(self.host, self.credential, read_timeout=...) as ct_client:self.credential can be either a master key or an AAD credential depending on the CI lane configuration. By hardcoding self.masterKey, the async tests bypass AAD token-refresh paths entirely. In AAD CI lanes, the sync tests exercise AAD + read_timeout interaction (where the timeout could fire during token acquisition), but the async tests never do.
Suggestion: Use self.credential (or test_config.TestConfig.create_data_client_async(read_timeout=...)) for consistency with the sync tests.
| from packaging.requirements import Requirement | ||
| from packaging.version import Version | ||
|
|
||
| from azure.cosmos.aio import CosmosClient # noqa: F401 |
There was a problem hiding this comment.
🟡 Recommendation — Module-scope import defeats the packaging test
This module-level import means if the aio extras are broken (e.g., aiohttp is missing), the entire test module fails during collection with an ImportError before test_aio_extras_declared_in_distribution_metadata can run and report a targeted failure.
The test's purpose is to validate the aio extra is declared correctly in package metadata. If the import is broken, the most useful outcome is the metadata test failing with a clear message like "aio extra not found in distribution metadata" — not a raw ImportError during collection that doesn't explain the root cause.
Suggestion: Move the import inside test_azure_cosmos_aio_module_imports():
def test_azure_cosmos_aio_module_imports(self):
"""Importing from azure.cosmos.aio should work when the aio extra is installed."""
from azure.cosmos.aio import CosmosClient # noqa: F401
self.assertIsNotNone(CosmosClient)This way the metadata test can run independently and report whether the package metadata is correct, even when the import itself is broken.
|
|
||
| # Collect transient setup objects so they don't show up as growth. | ||
| gc.collect() | ||
| tracemalloc.start() |
There was a problem hiding this comment.
🟢 Suggestion — Guard tracemalloc.start()/stop() against external tracing tools
The tracemalloc.stop() is now properly in a finally block (good), but start() is still called unconditionally. If an external tool (e.g., pytest-memray, a memory profiler, or another test) already has tracing active, this test's stop() will terminate their tracing session.
Python docs: tracemalloc.stop() stops all tracing regardless of who started it. start() when already active only updates nframe.
Suggestion:
was_tracing = tracemalloc.is_tracing()
if not was_tracing:
tracemalloc.start()
try:
snapshot_before = tracemalloc.take_snapshot()
# ... iteration ...
finally:
if not was_tracing:
tracemalloc.stop()Same pattern should apply to the async mirror.
|
|
||
| # Always return two children whose ranges equal the requested head | ||
| # so the loop never makes progress and the guard must trip. | ||
| async def _always_multi_overlap(_rid, feed_ranges, _opts): |
There was a problem hiding this comment.
🟢 Suggestion — Mock signature lacks **kwargs — fragile against future callers
async def _always_multi_overlap(_rid, feed_ranges, _opts):The real get_overlapping_ranges signature accepts **kwargs. If a future change adds a keyword argument to the call site (e.g., for tracing or retry context), this mock raises TypeError: _always_multi_overlap() got an unexpected keyword argument — a test failure caused by a correct production change.
The same pattern appears in the sync sibling test_query_feed_range_multipartition.py.
Suggestion: Add **_kwargs to both mocks:
async def _always_multi_overlap(_rid, feed_ranges, _opts, **_kwargs):|
✅ Review complete (57:48) Posted 10 inline comment(s). Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage |
| as its default — would compare equal to it and get silently excluded. | ||
| Stripping the bad inputs once, at the boundary, prevents that collision. | ||
| Used at every location-list entry point so blank inputs cannot accidentally | ||
| match real endpoints during later comparisons. |
There was a problem hiding this comment.
💬 Observation — Docstring trim removes the rationale for _clean_location_list's existence
The previous docstring explained why this function exists: _normalize_region_name(None) returns "", and unfiltered None/empty entries in exclusion lists would collide with "" in the by-endpoint lookup map, causing every endpoint that isn't found in the map to silently match the excluded entry. The new docstring says "so blank inputs cannot accidentally match real endpoints" — which is correct but doesn't explain the mechanism.
A future maintainer looking at this helper could reasonably conclude it's just defensive overcaution and remove it, re-introducing the silent exclusion bug where None entries in excluded_locations cause every non-mapped endpoint to be excluded.
Consider keeping a one-line hint about the _normalize_region_name(None) → "" collision:
"""Return the list with None, empty, and whitespace-only entries removed.
Used at every location-list entry point so blank inputs cannot accidentally
match real endpoints: _normalize_region_name(None) returns "", which would
collide with any missing endpoint lookup during later comparisons.
"""Not blocking — the behavior is correct and the code is clear on its own.
| iterator would pull one page at a time.""" | ||
|
|
||
| def _drive_pages(self, page_bodies): | ||
| """Calls the request function once per body and returns the |
There was a problem hiding this comment.
🟢 Suggestion — Tests exercise per-response decode isolation, not actual pager state
The module docstring says "paging through one response after another keeps working" and the test class names include "Paged", but _drive_pages() calls _synchronized_request._Request() (a single-response decode function) once per body with completely fresh mock args each time. No CosmosItemPaged, ItemPaged, or continuation-token state carries between calls.
This is still useful — it validates that the codec doesn't leak corrupt state across repeated _Request() invocations (the bad→good→bad 3-page sequence is particularly good). But a regression in pager-level continuation handling, iterator state, or response_headers lifecycle across pages would not be caught by these tests.
The test name could better reflect what it actually proves. Something like test_response_decoding_multi_call.py or adjusting the module docstring to say "calling the request function multiple times in sequence" rather than "paging" would prevent confusion.
|
✅ Review complete (04:50) Posted 2 inline comment(s). Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage |
enhancing python SDK test coverage